ABSTRACT: We have developed a family of quantitative descriptors in order to provide non-invasive, reliable means of distinguishing benign from malignant breast lesions. These include acoustic descriptors (“echogenicity,” “heterogeneity,” “shadowing”) and morphometric descriptors (“area,” “aspect ratio,” “border irregularity,” “margin definition”). These quantitative descriptors are designed to be independent of instrument properties and physician expertise. Our analysis included manual tracing of lesion boundaries and adjacent areas on grayscale images generated from RF data.

To  derive  quantitative  acoustic  features,  we  computed  spectral-parameter  maps  of  radio-frequency 
(RF)  echo  signals  (using  a  sliding-window  Fourier  analysis)  of  the  lesion  and  adjacent  areas.  We 
quantified  morphometric  features  by  geometric  and  fractal  analysis  of  traced  lesion  boundaries. 
Although  no  single  parameter  can  reliably  discriminate  cancerous  from  non-cancerous  breast  lesions, 
multi-feature  analysis  provides  excellent  discrimination  of  cancerous  and  non-cancerous  lesions.  Our 
analysis  of  data  acquired  during  routine  ultrasonic  examination  of  130  biopsy-scheduled  patients 
produced  a  receiver-operating  characteristic  (ROC)  area  under  the  curve  (AUC)  of  0.947±0.045. 
Lesion-margin  definition,  spiculation,  and  border  irregularity  were  the  most  useful  among  the 
quantitative  descriptors;  some  morphometric  features  (such  as  border  irregularity)  also  were 
particularly  effective  in  lesion  classification.  Our  results  are  consistent  with  many  of  the  Breast 
Imaging Reporting and Data System (BI-RADS) breast-lesion-classification criteria in use today.

Keywords:  Breast  diseases,  breast  cancer,  computer-aided  diagnosis  (CAD),  fractal  analysis,  morphometric 
analysis,  multi-feature  analysis,  receiver-operating  characteristics  (ROC),  sonography,  spectrum  analysis, 
texture  analysis,  tissue  characterization,  tumor  classification,  ultrasonic  imaging,  ultrasound. 

1.  BACKGROUND  AND  INTRODUCTION 

Breast cancer affects one of every eight women, kills one of 29 women in the United States, and is the leading cause of death in women in developed countries [1,2]. An estimated 207,090 new cases of breast cancer, and 39,840 deaths, are expected among women in the US in 2010 [3]. Survival rates for advanced-stage breast cancers have improved significantly and early-stage breast cancers are now virtually curable [4]. Consequently, early detection can play a crucial role in a patient’s survival.

Of the breast biopsies (annually around 1.7 million, according to a National Cancer Institute estimate) performed in the US, 70-90% are benign [5]. A method that reliably identifies benign lesions (with virtually zero false negatives) would prevent many unneeded biopsies, which are expensive and, as in any surgical procedure, involve minor risks. Assuming the average cost of a biopsy procedure to be $2,500, even a 10% reduction in biopsies (170,000 biopsies) would result in a saving of almost a half billion dollars a year in the US. (In fact, the more common surgical biopsies cost $2,500-$5,000, whereas needle biopsies cost $750-$1,200.) Furthermore, unneeded biopsies impose needless risk of complications, incur additional health-care costs, and needlessly heighten patient anxiety (e.g., while awaiting pathology results).

Unlike some other cancer types, most breast cancers are visible in B-mode ultrasound images. Advances in ultrasonic imaging technology allow detailed examination of breast-tumor characteristics. Although no single B-mode feature has been found to be a reliable identifier of malignancy, recent clinical studies have shown that a combination of selected B-mode features can be effective for breast cancer identification [5,7-9]. The American College of Radiology (ACR) developed the Breast Imaging Reporting and Data System (BI-RADS) lexicon for features describing the ultrasound appearance of breast lesions to improve the accuracy of breast ultrasound diagnosis [10,11]. BI-RADS defines six different possible findings (Category 0 to 5). Category 0 indicates that assessment is incomplete and additional imaging evaluation is necessary, whereas Category 1 lesions are virtually certainly benign and Category 5 lesions have features that are highly suggestive of malignancy; i.e., the likelihood of malignancy increases from virtually zero in Category 1 to virtually certain in Category 5. Several studies have reported encouraging results from automated quantitative analysis employing single [12-16] as well as multiple [17] features using data from modern ultrasonic scanners. This list is not exhaustive, and many other groups have reported results for automated methods of breast-cancer identification, although some studies did not compensate for the contribution from the ultrasound scanning system.

Ultrasonic feature analysis for breast lesion diagnosis

Fig. 1: (a) Malignant lesion (in situ and invasive ductal carcinoma): the lesion has irregular multilobular shape, “tall” aspect ratio, heterogeneous internal texture, poorly defined margin, and a prominent posterior shadow. (b) Benign lesion (fibroadenoma): the lesion has the classical near-spherical shape, a smooth boundary, clearly-defined margin, homogeneous internal texture, and a posterior “anti-shadow” or enhancement. (Note the edge shadows due to refractive effects.)

We implemented a quantitative multi-feature-analysis procedure, using acoustic as well as morphometric features, that applies the BI-RADS criteria currently employed subjectively by clinicians. The acoustic features include measures of lesion echogenicity, heterogeneity, and central shadowing, based on spectrum analysis of RF echoes [18]. The morphometric features include area, location, aspect ratio, and boundary roughness of the lesions. We also employed hybrid features that use combined acoustic and border information, e.g., margin definition. Here we provide a brief report of our findings. We previously reported preliminary results for this study in conference proceedings [19,20]. We also published a detailed report of our findings in a journal paper [21].

2.  METHODS 

Diagnostically useful lesion characteristics investigated in our study include features based on acoustic properties (acoustic features) as well as on their shapes or boundaries (morphometric features). The features found to be the most useful in the multi-feature studies are listed below in Table I. The following features are the most important ones for distinguishing cancerous from non-cancerous lesions: internal texture (heterogeneous vs. homogeneous), central shadow (shadow vs. enhancement), shape (irregular vs. regular), aspect ratio (height divided by width) with respect to the duct axis (greater than unity vs. less than unity), border quality (irregular vs. regular), and margin definition (poorly defined vs. well defined).
Figure 1 presents ultrasound grayscale image examples of a benign lesion and a malignant lesion with
many of the typical characteristics of each lesion type. The malignant lesion exhibits heterogeneous
internal texture, a central shadow, a poorly defined margin, an irregular shape, and a “tall” aspect ratio; all


Table I: Features of conventional B-mode images associated with malignant and benign lesions. A typical lesion may have only a subset of these identifying features.

Malignant lesions                                      | Benign lesions
Internal features      | Morphometric features         | Internal features           | Morphometric features
Central shadowing      | Irregular shape/spiculation   | Edge shadowing/enhancement  | Spherical/ovoid shape
Hypoechogenicity       | Poorly defined margin         | Hyperechogenicity           | Linear well-defined margin
Heterogeneous texture  | Tall aspect ratio             | Homogeneous texture         | Thin capsule
Calcifications         | Microlobulation               |                             | Gentle bi- or trilobulations
                       | Architectural distortion      |                             | Orientation parallel to tissue plane

these  features  are  typical  of  malignant  lesions.  In  contrast,  the  benign  lesion  exhibits  a  central  posterior 
enhancement  (sometimes  referred  to  as  “anti-shadow”),  a  homogeneous  internal  texture,  a  clearly  defined 
margin,  a  smooth  shape,  and  an  aspect  ratio  of  less  than  unity  typical  of  benign  lesions.  However,  both 
lesions  are  hypoechoic. 

Successful  classification  using  these  qualitative  image  characteristics  is  invariably  dependent  on  clinician 
skills.  Our  research  addresses  the  development  of  quantitative  descriptors  to  provide  operator-independent 
lesion  identification.  Quantitative  descriptors  will  also  increase  reliability  of  lesion  identification  and  may 
allow  identification  of  smaller  lesions.  We  first  identified  features  that  lend  themselves  to  quantification; 
not  all  subjective  features  are  reliably  quantifiable.  The  procedure  to  quantify  acoustic  and  morphometric 
features  is  described  below. 

2.1  Data  Acquisition 

The  data  for  this  IRB-approved  study  consisted  of  RF  echo-signal  data  that  were  digitized  from  breast 
lesions  before  any  non-linear  processing  such  as  compression  or  envelope  detection  occurred  within  the 
scanner.  These  data  were  acquired  from  130  patients  during  routine  ultrasonic  examinations  that  occurred 
prior  to  a  scheduled  biopsy.  Subsequent  biopsies  of  examined  lesions  determined  that  104  patients  had 
benign  masses  and  26  had  malignant  masses.  These  biopsy  and  RF  data  were  acquired  at  the  following 
three  clinical  sites:  Thomas  Jefferson  University,  the  University  of  Cincinnati,  and  Yale  University.  These 
patients  had  undergone  mammography  prior  to  the  ultrasound  examinations  and  had  mammographically 
visible  lesions.  The  following  exclusion  criteria  were  applied:  age  less  than  18  years  (due  to  legal  consent 
limitations),  prior  breast  carcinoma,  biopsy  or  mastectomy,  breast  implant,  simple  cyst,  pregnancy, 
microcalcifications  not  associated  with  a  mass  on  sonography,  and  male  or  transsexual  gender.  Masses 
were  examined  with  the  patient  in  the  standard  supine  position  by  an  experienced  radiologist  or 
sonographer using a Philips Ultrasound (Bothell, WA) UM-9 HDI scanner. An L10-5 (7.5 MHz) linear
array  transducer  was  employed  at  a  default  (constant)  power  level  and  a  single  transmit  focal  length 
selected  by  the  operator.  Standard  ultrasonic  breast-examination  procedures  were  employed.  Data  were 
sampled  at  20  MHz  at  an  effective  dynamic  range  of  14  bits.  Time  Gain  Control  (TGC)  data  were 
acquired  for  every  scan,  and  RF  data  were  corrected  for  TGC  before  processing. 
2.2 Analysis Procedures

We manually demarcated a set of analysis regions on B-mode images generated from the RF data and analyzed each region to compute the quantitative features described below. The benign and malignant lesions of Fig. 1 are shown again in Fig. 2 with traces of the nine analysis regions superimposed. With respect to the tumor, these regions are: left-anterior, tumor-anterior, right-anterior, left-lateral, tumor, right-lateral, left-posterior, tumor-posterior, and right-posterior. As discussed below, we needed only the tumor trace for the majority of quantitative features. However, we needed the tumor-posterior region as well as the left- and right-posterior regions for shadow measurements, and all analysis regions except the anterior regions for computing relative absorption. Most lesions were scanned in multiple (but at least two orthogonal) scan planes, thereby providing redundant data for a given lesion. We averaged each quantitative feature value for multiple scans of a specific lesion to arrive at a single number. All processing software was written in MATLAB™ (The MathWorks, Inc., Natick, MA), except for the tumor-tracing program, which was written in Visual Basic™ (Microsoft Corporation, Redmond, WA).

2.3  Acoustic  Features 

We defined quantitative acoustic features in terms of calibrated spectrum analysis [18] parameters. Spectrum analysis involves several steps. First, a Hamming window is applied to RF data, a power spectrum is computed from the Fourier transform of the windowed data segment, and the resultant power spectrum is converted to dB. Next, system and diffraction effects† are subtracted from the computed spectrum to derive the desired tissue spectrum. Finally, the computed spectrum is analyzed with linear regression techniques applied over the bandwidth of the signal; the primary parameters of interest are the slope of the regression line (SLP, s), its value at the midpoint of the signal bandwidth (MBF, M), and its intercept at zero frequency (INT, I). Images of these parameters are created by progressively sliding the Hamming window over all RF data and repeating the above sequence. The spectrum analysis procedure is illustrated in Fig. 3.

Fig. 2: B-scan images of Fig. 1 with superimposed analysis-region traces. (a) Malignant lesion. (b) Benign lesion.

Fig. 3: Illustration of the spectrum analysis procedure. (a) B-scan image. (b) Power spectrum vs. frequency (MHz). The calibrated power spectrum of windowed (typically Hamming, of length L) RF data is evaluated. A linear regression line through the power spectrum is computed. In this example, M is the midband value (value of the regression line at center frequency f0) and I is the spectral intercept (value of the regression line extrapolated to f = 0).

† Measured spectra depend not only on tissue properties, but also on 1) the combined two-way transfer function of the transducer and the ultrasonic-system electronic modules, 2) the two-way range-dependent diffraction function (beam properties), and 3) acoustic attenuation. Corrections for the first two functions involve experimental data obtained at each transmit focal length. The electronic transfer function was estimated using a planar reflection method from RF data acquired from the front planar surface of an RTV silicone block in a water bath. The diffraction function was estimated using a reference phantom method from data obtained by scanning a rubber block containing a diffuse suspension of 10-µm diameter glass spheres. Then, breast tissue spectral parameters were estimated by subtracting the contribution from transfer function and diffraction. Finally, an empirical attenuation coefficient was used to correct for attenuation as described further below. The system effects are analyzed in detail in other theoretical papers [18,23,24].

Bangladesh Journal of Medical Physics, Vol. 4, No. 1, 2011
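The spectrum-analysis steps described in this section (Hamming window, power spectrum in dB, subtraction of a calibration spectrum, linear regression over the signal band) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' MATLAB implementation; the function names, the naive DFT, and the `calibration_db` argument are assumptions made for the example.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def power_spectrum_db(segment, fs_mhz):
    """Return (frequencies in MHz, power in dB) of a Hamming-windowed RF segment.

    A naive DFT keeps the sketch dependency-free; an FFT would be used in practice.
    """
    n = len(segment)
    windowed = [s * w for s, w in zip(segment, hamming(n))]
    freqs, power_db = [], []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs_mhz / n)
        power_db.append(10 * math.log10(abs(acc) ** 2 + 1e-30))  # floor avoids log(0)
    return freqs, power_db


def calibrate(spectrum_db, calibration_db):
    """Subtract a (hypothetical) system/diffraction calibration spectrum, in dB."""
    return [p - c for p, c in zip(spectrum_db, calibration_db)]


def fit_spectral_parameters(freqs, spectrum_db, f_lo, f_hi, f0=None):
    """Least-squares line over [f_lo, f_hi]; returns (slope, midband, intercept).

    f0 defaults to the midpoint of the band; pass the center frequency explicitly
    if it differs from the band midpoint.
    """
    pts = [(f, p) for f, p in zip(freqs, spectrum_db) if f_lo <= f <= f_hi]
    n = len(pts)
    mean_f = sum(f for f, _ in pts) / n
    mean_p = sum(p for _, p in pts) / n
    sxx = sum((f - mean_f) ** 2 for f, _ in pts)
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in pts)
    slope = sxy / sxx
    intercept = mean_p - slope * mean_f  # regression line extrapolated to f = 0
    if f0 is None:
        f0 = 0.5 * (f_lo + f_hi)
    midband = intercept + slope * f0  # regression line evaluated at f0
    return slope, midband, intercept
```

Sliding the window along each RF line and calling these functions at every position would give one (s, M, I) triple per window position, which is how parameter images of the kind described above could be assembled.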

Fig. 4: Echogenicity for data corresponding to Fig. 3(a). (a) B-scan image; (b) image of I; (c) I within lesion; (d) histogram of I. According to our definition, echogenicity is -43.2 dB.

Fig. 5: Shadowing for data corresponding to Fig. 3(a). (a) B-scan image; (b) image of M; (c) M in the ROIs; (d) tumor thickness and mean M in the posterior region vs. lateral position (mm). According to our definition, shadowing is -13.65 dB.

In the absence of attenuation in the intervening media, the linear regression line through the power spectrum can be written as $P(f) = I + sf$, where $I$ is the spectral intercept, $s$ is the slope, and $f$ is frequency. Thus, the midband fit is $M = I + sf_0$, $f_0$ being the center frequency. In the presence of attenuation in intervening tissue, the linear regression line through the power spectrum is $P_a(f) = P(f) - 2\alpha d f = I + (s - 2\alpha d)f$, where $\alpha$ is the effective attenuation coefficient (dB/MHz-cm) and $d$ is the depth of intervening tissue. Hence, the spectral intercept is $I_a = I$, the midband fit is $M_a = I + (s - 2\alpha d)f_0$, and the slope is $s_a = s - 2\alpha d$. Thus, the presence of attenuation affects slope and midband fit, but not intercept. The necessary assumption is that attenuation (in dB) varies linearly with frequency. Although this assumption is only approximate, the conclusion about the invariance of intercept in the presence of tissue attenuation has proved to be accurate in our experience.
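The attenuation relations above can be checked numerically with a short sketch. The function names and the example values are hypothetical; the only model assumed is the one stated in the text, namely a two-way loss of 2αd dB per MHz that is linear in frequency.

```python
def attenuated_parameters(intercept_db, slope_db_per_mhz, f0_mhz, alpha, depth_cm):
    """Apply two-way attenuation (alpha in dB/MHz-cm) over depth_cm of tissue.

    Returns (I_a, s_a, M_a): the intercept is unchanged, the slope drops by
    2*alpha*d, and the midband fit drops by 2*alpha*d*f0.
    """
    loss_per_mhz = 2.0 * alpha * depth_cm
    slope_a = slope_db_per_mhz - loss_per_mhz
    midband_a = intercept_db + slope_a * f0_mhz
    return intercept_db, slope_a, midband_a


def correct_for_attenuation(slope_a, midband_a, f0_mhz, alpha, depth_cm):
    """Undo the attenuation terms to recover the unattenuated slope and midband fit."""
    loss_per_mhz = 2.0 * alpha * depth_cm
    return slope_a + loss_per_mhz, midband_a + loss_per_mhz * f0_mhz


# Hypothetical example: I = -40 dB, s = -1 dB/MHz, f0 = 7.5 MHz,
# alpha = 1 dB/MHz-cm, and 2 cm of intervening tissue.
i_a, s_a, m_a = attenuated_parameters(-40.0, -1.0, 7.5, 1.0, 2.0)
s, m = correct_for_attenuation(s_a, m_a, 7.5, 1.0, 2.0)
```

In this example the slope changes from -1 to -5 dB/MHz and the midband fit from -47.5 to -77.5 dB, while the intercept stays at -40 dB, which is the invariance the text relies on for echogenicity.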

In our work, the following definitions are employed to provide quantitative assays of qualitative (B-mode) acoustic features: window length, W = 2.4 mm; spectral bandwidth, B = 4 MHz (5-9 MHz); and attenuation coefficient, α = 1 dB/MHz-cm.

A) Echogenicity is defined as the mean value of spectral intercept, $\mu_I$, within the lesion. Since spectral intercept is largely independent of frequency-dependent attenuation in the intervening media, no attenuation correction is necessary. Calculation of lesion echogenicity is illustrated in Fig. 4 using the example of invasive ductal carcinoma shown in Fig. 1a. Figure 4a shows the B-scan image; the corresponding image of spectral intercept (I) is shown in Fig. 4b. Figure 4c shows the intercept image within the traced lesion boundary and Fig. 4d shows the histogram of I within the traced lesion. The quantitative value of echogenicity is the mean value of INT within the lesion (-43.2 dB). The fibroadenoma in Fig. 1b has an echogenicity of -36.0 dB.

Shadowing is defined as the difference (normalized by lesion thickness) between mean M values in comparable shadowed and unshadowed regions posterior to the lesion. (The difference between tumor and tumor-posterior is compared with the average of the differences between left-lateral and left-posterior, and right-lateral and right-posterior.) This difference can be used to estimate the attenuation coefficient within the lesion; we also applied the same approach in tissue areas that have no lesion to estimate the average attenuation coefficient and to check whether these estimates are consistent with the assumed value. Calculation of central shadowing is illustrated in Fig. 5 for the cancerous lesion in Fig. 1a, which casts a noticeable central shadow. Figure 5a shows the B-scan image; the corresponding image of M is shown in Fig. 5b. Figure 5c shows the MBF image inside the tumor as well as in the posterior regions. Mean values of MBF in the lateral posterior regions are -4.0 and -10.1 dB (their mean being -7.05 dB). The mean of MBF posterior to the lesion is -20.7 dB, which is 13.65 dB lower. Thus, this tumor casts a 13.65-dB central shadow. Figure 5d separately plots the vertical thickness of the tumor and the mean value of M in the posterior region vs. lateral position. The mean value of M decreases with increasing lesion thickness and is the lowest at the thickest point of the tumor. The fibroadenoma in Fig. 1b has caused a central posterior enhancement of +8.35 dB.

B) Relative absorption is a composite feature and is defined as $r_a = (M_{pn} - M_{an})/d_1 - (M_{pl} - M_{al})/d_2$ [22], where $M_{al}$ is the mean of midband fit inside the lesion, $M_{pl}$ is the mean of midband fit posterior to the lesion, $M_{an}$ is the mean of midband fit in normal tissue next to the lesion, $M_{pn}$ is the mean of midband fit in normal tissue lateral and posterior to the lesion, $d_1$ is the spatial distance between the centroids of the $M_{pn}$ and $M_{an}$ regions, and $d_2$ is the spatial distance between the centroids of the $M_{pl}$ and $M_{al}$ regions. $M_{pn}$ and $M_{an}$ can be averaged for the left and right lateral regions. The relative absorption value for the malignant lesion in Fig. 1a is -0.12 dB. In contrast, the fibroadenoma in Fig. 1b has a relative absorption of -0.47 dB.

C) We can define heterogeneity in several ways. It can be defined as the standard deviation of midband fit values, $\sigma_M$, within the lesion [23], and we can assess the heterogeneity of the lesion by comparing $\sigma_M$ within the lesion with $\sigma_M$ for a homogeneous region. In homogeneous tissue regions, the standard deviations of $M$, $s$, and $I$ have theoretical values [23]: $\sigma_M = 5.6/\sqrt{BW}$, $\sigma_s = 5.6\sqrt{12}/(B\sqrt{BW})$, and a value of $\sigma_I$ that additionally depends on $f_0$, where $f_0$ is the center frequency (MHz), $B$ is the bandwidth (MHz), and $W$ is the Hamming-window length (mm). As tissue becomes more heterogeneous, the standard deviations of these measured parameters increase from the above theoretical values. We select $\sigma_M$ to provide an index of tissue heterogeneity because $M$ typically provides less-noisy estimates compared to the other two spectral parameters; this permits smaller departures from homogeneity to be detected. The standard deviation of MBF inside the lesion for the in situ and invasive ductal carcinoma in Fig. 1a is 5.6 dB, whereas it should be 1.77 dB for a homogeneous region. (For our processing parameters, $L = 2.5$ mm, $f_0 = 7.5$ MHz, and $B = 4$ MHz, $\sigma_M = 1.77$, $\sigma_s = 1.53$, and $\sigma_I = 4.55$.) In contrast, $\sigma_M$ for the fibroadenoma in Fig. 1b is 4.4 dB.

Heterogeneity also may depend on texture, and $\sigma_M$ contains no textural information. Therefore, we defined heterogeneity in terms of the texture of midband fit inside the lesion; texture was defined in terms of a four-neighborhood pixel algorithm (FNPA) [25] and a Hurst-coefficient fractal-dimension measure [26,27]. (The co-occurrence matrix has also been used to estimate B-mode texture [28], but the calculation cost of the co-occurrence matrix is high.) FNPA yields -0.03 dB for the cancer in Fig. 1a and -0.04 dB for the lesion in Fig. 1b.
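The acoustic-feature definitions in this section reduce to simple arithmetic on regional means of the spectral parameters, sketched below in plain Python. The function names are hypothetical. The shadowing function follows the worked example in the text (posterior mean minus the average of the lateral-posterior means) and omits the thickness normalization mentioned in the definition; the relative-absorption inputs in the usage line are made-up values, and only the formulas for $\sigma_M$ and $\sigma_s$ are included for the homogeneous case.

```python
import math


def mean(values):
    return sum(values) / len(values)


def echogenicity(intercepts_in_lesion_db):
    """Mean spectral intercept (dB) over the pixels inside the traced lesion."""
    return mean(intercepts_in_lesion_db)


def central_shadow(m_tumor_posterior_db, m_lateral_posterior_db):
    """Mean M behind the lesion minus the average mean M of the lateral-posterior regions.

    Negative values indicate shadowing; positive values indicate enhancement.
    """
    return m_tumor_posterior_db - mean(m_lateral_posterior_db)


def relative_absorption(m_pn, m_an, d1, m_pl, m_al, d2):
    """r_a = (M_pn - M_an)/d1 - (M_pl - M_al)/d2."""
    return (m_pn - m_an) / d1 - (m_pl - m_al) / d2


def homogeneous_sigmas(bandwidth_mhz, window_mm):
    """Theoretical sigma_M and sigma_s for a homogeneous region."""
    root_bw = math.sqrt(bandwidth_mhz * window_mm)
    sigma_m = 5.6 / root_bw
    sigma_s = 5.6 * math.sqrt(12.0) / (bandwidth_mhz * root_bw)
    return sigma_m, sigma_s


# Values quoted in the text for the lesion of Fig. 1a.
shadow_db = central_shadow(-20.7, [-4.0, -10.1])
sigma_m, sigma_s = homogeneous_sigmas(4.0, 2.5)
# Hypothetical inputs, only to show the call signature.
r_a = relative_absorption(-10.0, -6.0, 2.0, -20.0, -8.0, 2.0)
```

With the quoted numbers, `shadow_db` evaluates to about -13.65 dB and the theoretical standard deviations round to 1.77 and 1.53, which agrees with the values given for B = 4 MHz and a 2.5-mm window.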

2.4  Morphometric  Features 

Invasive ductal carcinomas generally have “fuzzy” borders due to their invasive margins. Cancers that have little desmoplastic reaction (proliferation of fibroblasts) typically have clear margins, but still are highly irregular in shape. Chou et al. [15] demonstrated good performance (97.2% sensitivity and 94.1% negative predictive value) using only quantitative lesion-shape features describing irregular boundaries (or deviation from a smooth shape). We have employed the following definitions for the quantitative morphometric descriptors that are related to the shape or border of the lesion. All morphometric features have been computed using lesion boundaries traced on B-mode images.

A) Area is defined as the total lesion area in square cm. Lesion area for the cancer in Fig. 1a is 0.73 cm². Lesion area has not been found to be a reliable feature for lesion classification.


[Fig. 6, panels (a) Ovoid and (b) Irregular border]


B) Aspect ratio is defined as the maximum vertical lesion dimension divided by the maximum horizontal lesion dimension. The aspect ratio (height divided by width) often exceeds 0.8 in breast carcinomas for small lesions [29]. In larger carcinomas, this criterion is less useful due to their more irregular shapes and growth along duct axes. The lesion aspect ratio is 0.97 for the cancer in Fig. 1a and 0.75 for the (smaller) benign lesion in Fig. 1b.

C) We define border irregularity in terms of a fractal dimension. Fractal dimension of a closed contour can be used to represent its border roughness [30]. Mandelbrot [31] has investigated the fractal dimensions of geographical boundaries. On the other hand, convexity [32] (ratio between convex perimeter and actual lesion perimeter) also can be used to express border irregularity; this is an excellent descriptor of spiculation. We illustrate using four contours of varying border roughness in Fig. 6. The smooth ovoid in Fig. 6a has a low fractal dimension (1.01) and convexity of unity. For the non-spiculative rough border depicted in Fig. 6b, fractal dimension is 1.05 and convexity is 0.86. For the mild spiculation in Fig. 6c, fractal dimension is 1.06 whereas convexity is 0.84. For the moderate spiculation in Fig. 6d, the fractal dimension is higher (1.18) and the convexity is much lower (0.51). We note that further increase in spiculation drastically reduces convexity; thus, convexity is rather adept at identifying spiculation and also an excellent descriptor of border irregularity. Fractal dimension is also an excellent quantitative descriptor of border irregularity. For the malignant lesion in Fig. 1a, the fractal dimension and convexity are 1.13 and 0.90, respectively, and for the benign lesion in Fig. 1b, they are 1.03 (lower) and 0.99 (higher), respectively.


[Fig. 6, panels (c) Spiculation (mild) and (d) Spiculation]


Fig.  6:  Illustration  of  border  roughness. 


D) We have defined margin definition as the sum of magnitude of gradient of midband fit on a lesion contour normalized by the sum of magnitude of the gradient of midband fit on the lesion contour. (Normalization is required to remove dependence on contour length as well as magnitudes of midband fit values.) Because this feature uses both the lesion contour and a spectral parameter, it is a hybrid feature. We use midband fit instead of the envelope of RF echoes because MBF is statistically well-behaved and can more easily be corrected for system effects and diffraction. Gradient-based margin definition values for the lesions in Figs. 1a and 1b are 0.07 and 0.10, respectively.
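The aspect-ratio and convexity definitions above can be illustrated on a traced contour given as a list of (x, y) points. This is a sketch under assumptions: the function names are hypothetical, y is taken as the vertical (depth) axis, and the convex hull is computed with Andrew's monotone-chain algorithm, which is a choice made for this example rather than a method named in the paper. Fractal dimension is not shown because the text does not specify the estimator.

```python
import math


def aspect_ratio(contour):
    """Maximum vertical extent divided by maximum horizontal extent."""
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    return (max(ys) - min(ys)) / (max(xs) - min(xs))


def perimeter(polygon):
    """Length of the closed polygon through the points in order."""
    return sum(math.dist(polygon[i], polygon[(i + 1) % len(polygon)])
               for i in range(len(polygon)))


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity(contour):
    """Convex-hull perimeter divided by the actual contour perimeter (at most 1)."""
    return perimeter(convex_hull(contour)) / perimeter(contour)


# Hypothetical contours: a smooth rectangle and a square with one deep notch.
smooth = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
notched = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (2.0, 2.0), (0.0, 4.0)]
```

For the rectangle the convexity is exactly 1, and the notch lowers it to roughly 0.91, mirroring the trend reported for Fig. 6, where convexity falls as the border becomes more spiculated.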

3.  RESULTS 

We have analyzed data for 130 patients (26 malignant and 104 benign). Scatter diagrams for selected quantitative acoustic features and morphometric features are presented in Fig. 7. The scatter diagrams of Fig. 7a (margin definition vs. aspect ratio) and 7b (fractal dimension vs. lesion texture) exhibit fairly clear separation between benign and malignant cases. With respect to non-cancers, the cancer cases exhibit poorer margin definition, larger aspect ratio, higher texture, and higher border irregularity. We drew a straight line through each scatter diagram such that all the malignant lesions are on the right of the line. Thus, there are only benign lesions on the left of each line. If we want to be even more conservative and further reduce the possibility of a false negative, we could move the line farther to the left.

Fig. 7: Scatter diagrams of selected lesion features. (a) Margin definition vs. aspect ratio. (b) Fractal dimension vs. lesion texture (FNPA, dB).

We performed an independent-samples t-test to assess whether the means of different parameters are statistically different for benign and malignant cases. We found that FNPA, Hurst Coefficient, Margin Definition, Aspect Ratio, Solidity, Convexity, and Hausdorff Dimension are significantly different for benign and malignant cases. Additional details about the t-test are in [21]. Because these features show fairly clear separation between cancers and non-cancers, a linear classification approach is indicated. We used logistic regression (LR) for our classification analysis.
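As an illustration of the per-feature comparison described above, the following sketch computes an independent-samples t statistic for one feature. The text does not say whether equal variances were assumed; the pooled-variance (Student) form is used here as an assumption, and the function name and sample values are hypothetical. Converting the statistic to a p-value requires the t distribution, which is left to a statistics package.

```python
import math


def pooled_t_statistic(benign, malignant):
    """Student's independent-samples t statistic (malignant minus benign).

    Returns (t, degrees_of_freedom), assuming equal variances in both groups.
    """
    n1, n2 = len(benign), len(malignant)
    m1 = sum(benign) / n1
    m2 = sum(malignant) / n2
    ss1 = sum((x - m1) ** 2 for x in benign)
    ss2 = sum((x - m2) ** 2 for x in malignant)
    dof = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m2 - m1) / std_err, dof


# Hypothetical feature values, only to show the call.
t_value, dof = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```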

All statistical analyses were performed using SPSS (SPSS Inc., Chicago, IL) using all quantitative features. Out of 130 patients, 121 were used; the other 9 had at least one quantitative feature missing. (For example, shadowing cannot be computed if posterior regions cannot be traced.) The classifier incorporated heterogeneity, margin definition, fractal dimension, and convexity using Wilks’ Lambda Stepwise Statistics. Classification performance was assessed using an ROC analysis [33], which plotted true-positive fraction (TPF) or sensitivity vs. false-positive fraction (FPF) or 1 minus specificity. TPF and FPF are defined as:

$$\mathrm{TPF} = \frac{\text{Number of correctly identified malignant lesions}}{\text{Total number of malignant lesions}}$$

and

$$\mathrm{FPF} = 1 - \frac{\text{Number of correctly identified benign lesions}}{\text{Total number of benign lesions}} \tag{5}$$

Fig. 8: ROC curve for multi-feature analysis (TPF vs. FPF). ROC Area: 0.9164 ± 0.0346.


Incorporation  of  all  four  parameters  in  an  LR  produced  an  area  under  the  ROC  curve  (AUC)  of  0.947±0.045 
(Fig.  8).  We  do  not  report  sensitivity  or  specificity  because  these  depend  on  the  chosen  operating  point  in 
the  ROC  curve.  Best  methods  achieve  high  TPF  values,  i.e.,  high  sensitivity  values,  for  concurrently  low 
FPF  values,  i.e.,  high  specificity  values. 
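The TPF and FPF definitions, and the area under the resulting empirical ROC curve, can be sketched as follows. The classifier scores are hypothetical (for instance, a logistic-regression output where higher means more suspicious), and the trapezoidal area is a simple empirical estimate; it does not reproduce the standard-error figure quoted with the AUC.

```python
def tpf_fpf(benign_scores, malignant_scores, threshold):
    """TPF and FPF when lesions scoring at or above the threshold are called malignant."""
    true_pos = sum(1 for s in malignant_scores if s >= threshold)
    true_neg = sum(1 for s in benign_scores if s < threshold)
    tpf = true_pos / len(malignant_scores)
    fpf = 1.0 - true_neg / len(benign_scores)
    return tpf, fpf


def roc_points(benign_scores, malignant_scores):
    """(FPF, TPF) pairs from the strictest threshold to the most lenient one."""
    thresholds = sorted(set(benign_scores) | set(malignant_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpf, fpf = tpf_fpf(benign_scores, malignant_scores, t)
        points.append((fpf, tpf))
    return points


def auc_trapezoid(points):
    """Area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores with one benign/malignant overlap.
benign = [0.1, 0.4]
malignant = [0.35, 0.8]
auc = auc_trapezoid(roc_points(benign, malignant))
```

Each threshold corresponds to one operating point on the curve, which is why sensitivity and specificity cannot be quoted without first choosing where on the ROC curve to operate.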


4.  CONCLUSION 

Many radiologists now use breast ultrasound findings, based on BI-RADS criteria, to recommend periodic follow-up without a biopsy. Whereas an expert clinician might be able to identify benign lesions accurately, a non-expert might misidentify ambiguous cases. We studied the performance of classification using quantitative features to determine whether such an approach might be of value in reducing misdiagnoses by less-expert readers. Our results suggest that an automated or semi-automated procedure might be able to assist a physician in making a diagnosis. Clearly, the method is not foolproof (ROC area ≠ 1.0). Due to the current legal environment in Western countries (particularly in the US), radiologists tend to be conservative. Our results suggest that we can achieve a near-perfect NPV (negative predictive value) if we
design  the  procedure  to  be  conservative  by  operating  in  the  low  FPF  area  (e.g.,  where  %  <  TPF  <  Vf).  This, 
however, will mean a decreased number of avoided biopsies. The current procedure remains semi-quantitative as analysis regions are manually traced. Our recent work on automated boundary detection suggests that this method can be truly operator independent with automated lesion boundary segmentation. Note that manual lesion segmentation was performed by knowledgeable non-clinicians in our study. Thus, the method does not depend on a radiologist’s expertise for precise delineation of tumor borders. We will investigate the role of additional factors, e.g., age and body-mass index or equivalent. Furthermore, breast composition varies from person to person, which seems likely to influence RF echoes. Thus, for acoustic features in particular, the presumably normal, opposite (contralateral) breast tissue may provide a baseline to compare with the lesion-containing breast [34]. Because tissue properties change with time and differ among patients, such an approach would compensate for variations in breast density, time of the menstrual cycle, changes occurring at menopause, fat content, age, etc. in future analyses. Furthermore, our study data did not distinguish among fibrous, glandular, and fatty breasts. This may be important because breast composition can affect shadow values.

Because the breasts of young women typically are radiologically dense, X-ray mammography tends to be relatively ineffective for them. The sensitivity of X-ray mammography is significantly less (60%) in younger women (less than 50 years old) compared to women older than 50 (86%) [35]. However, breast tumors tend to grow faster in younger, estrogen-rich women [36]. Thus, early detection may be even more critical for the survival of younger women with breast cancer, where ultrasound can play an important role.